Sandwich algorithms for Bayesian variable selection

Authors

  • Joyee Ghosh
  • Aixin Tan
Abstract

Markov chain Monte Carlo (MCMC) algorithms have greatly facilitated the popularity of Bayesian variable selection and model averaging in problems with high-dimensional covariates where enumeration of the model space is infeasible. A variety of such algorithms have been proposed in the literature for sampling models from the posterior distribution in Bayesian variable selection. Ghosh and Clyde proposed a method to exploit the properties of orthogonal design matrices. Their data augmentation algorithm scales up the computation tremendously compared to traditional Gibbs samplers, and leads to the availability of Rao-Blackwellized estimates of quantities of interest for the original non-orthogonal problem. The algorithm has excellent performance when the correlations among the columns of the design matrix are small, but empirical results suggest that moderate to strong multicollinearity leads to slow mixing. This motivates the need to develop a class of novel sandwich algorithms for Bayesian variable selection that improves upon the algorithm of Ghosh and Clyde. It is proved that the Haar algorithm with the largest group that acts on the space of models is the optimum algorithm within the parameter expansion data augmentation (PXDA) class of sandwich algorithms. The result provides theoretical insight, but using the largest group is computationally prohibitive, so two new computationally viable sandwich algorithms are developed, which are inspired by the Haar algorithm but do not necessarily belong to the class of PXDA algorithms. It is illustrated via simulation studies and real data analysis that several of the sandwich algorithms can offer substantial gains in the presence of multicollinearity.
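The sandwich idea in the abstract can be illustrated on a toy discrete problem. The sketch below is not the authors' algorithm: the joint table `JOINT`, the hand-picked orbits in `ORBITS`, and all function names are hypothetical, chosen only so that every transition matrix can be computed exactly. A plain data augmentation (DA) step draws z given x and then x given z; a sandwich step inserts an extra move of z in between, here a redraw of z from its marginal restricted to the orbit of the current z under an assumed group action. For a two-state chain the second eigenvalue is the trace minus one, which gives a simple measure of mixing speed.

```python
from itertools import product

# Hypothetical joint distribution p(x, z): x in {0, 1}, z in {0, 1, 2, 3}.
JOINT = [
    [0.30, 0.10, 0.04, 0.06],  # x = 0
    [0.02, 0.08, 0.25, 0.15],  # x = 1
]


def marginals(joint):
    """Return the marginal distributions of x and of z."""
    px = [sum(row) for row in joint]
    pz = [sum(row[z] for row in joint) for z in range(len(joint[0]))]
    return px, pz


def sandwich_kernel(joint, orbits):
    """Exact x -> x' transition matrix of one sandwich iteration.

    Step 1: z  ~ p(z | x)
    Step 2: z' ~ marginal of z restricted to the orbit containing z
    Step 3: x' ~ p(x | z')
    With singleton orbits, step 2 does nothing and this is plain DA.
    """
    px, pz = marginals(joint)
    n_x, n_z = len(joint), len(joint[0])
    orbit_of = {z: orb for orb in orbits for z in orb}
    kernel = [[0.0] * n_x for _ in range(n_x)]
    for x, z, x_new in product(range(n_x), range(n_z), range(n_x)):
        orb = orbit_of[z]
        orb_mass = sum(pz[w] for w in orb)
        for z_new in orb:
            kernel[x][x_new] += (
                (joint[x][z] / px[x])            # step 1
                * (pz[z_new] / orb_mass)         # step 2
                * (joint[x_new][z_new] / pz[z_new])  # step 3
            )
    return kernel


def second_eigenvalue(kernel):
    """For a 2x2 stochastic matrix the eigenvalues are 1 and trace - 1."""
    return kernel[0][0] + kernel[1][1] - 1.0


# Orbits of three assumed group actions on z, from smallest to largest group.
ORBITS = {
    "plain DA (trivial group)": [(0,), (1,), (2,), (3,)],
    "sandwich (two orbits)": [(0, 2), (1, 3)],
    "sandwich (largest group)": [(0, 1, 2, 3)],
}

RESULTS = {
    name: second_eigenvalue(sandwich_kernel(JOINT, orbs))
    for name, orbs in ORBITS.items()
}

for name, lam in RESULTS.items():
    print(f"{name}: second eigenvalue = {lam:.4f}")
```

In this toy setting the second eigenvalue shrinks as the orbits grow, and with a single orbit covering all of z the chain produces independent draws of x. This mirrors, in miniature, the statement that the largest group gives the best sandwich algorithm; the cost of that extra step in a real model space is what the toy example does not capture.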

Similar resources

Bounding the False Discovery Rate in Local Bayesian Network Learning

Modern Bayesian Network learning algorithms are time-efficient, scalable and produce high-quality models; these algorithms feature prominently in decision support model development, variable selection, and causal discovery. The quality of the models, however, has often only been empirically evaluated; the available theoretical results typically guarantee asymptotic correctness (consistency) of t...

Bayesian forecasting with highly correlated predictors

This paper considers Bayesian variable selection in regressions with a large number of possibly highly correlated macroeconomic predictors. I show that acknowledging the correlation structure in the predictors can improve forecasts over existing popular Bayesian variable selection algorithms.

Bayesian Variable Selection with Related Predictors

In data sets with many predictors, algorithms for identifying a good subset of predictors are often used. Most such algorithms do not account for any relationships between predictors. For example, stepwise regression might select a model containing an interaction AB but neither main effect A or B. This paper develops mathematical representations of this and other relations between predictors, wh...

Variable Selection using MM Algorithms.

Variable selection is fundamental to high-dimensional statistical modeling. Many variable selection techniques may be implemented by maximum penalized likelihood using various penalty functions. Optimizing the penalized likelihood function is often challenging because it may be nondifferentiable and/or nonconcave. This article proposes a new class of algorithms for finding a maximizer of the pe...

Variable Selection by Perfect Sampling

Variable selection is very important in many fields, and for its resolution many procedures have been proposed and investigated. Among them are Bayesian methods that use Markov chain Monte-Carlo (MCMC) sampling algorithms. A problem with MCMC sampling, however, is that it cannot guarantee that the samples are exactly from the target distributions. This drawback is overcome by related methods kn...

Journal title:
  • Computational Statistics & Data Analysis

Volume 81

Publication date: 2015